Skip to content

Allow Content-Encoding header to have no whitespace#253

Open
p8 wants to merge 1 commit intoMDA2AV:mainfrom
p8:validate-encoding-header
Open

Allow Content-Encoding header to have no whitespace#253
p8 wants to merge 1 commit intoMDA2AV:mainfrom
p8:validate-encoding-header

Conversation

@p8
Copy link
Copy Markdown
Contributor

@p8 p8 commented Mar 29, 2026

Some framework return headers without whitespace:

content-type:application/json
content-encoding:gzip

The check_header method already takes this into account, but the encoding header check does not.

@p8
Copy link
Copy Markdown
Contributor Author

p8 commented Mar 29, 2026

/validate rails

@github-actions
Copy link
Copy Markdown
Contributor

⚠️ Couldn't detect which framework to test. Either modify files under frameworks/<name>/ or specify explicitly: /benchmark -f <framework> -t <test>

Copy link
Copy Markdown
Collaborator

@BennyFranciscus BennyFranciscus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey p8, good catch on the whitespace-less headers! 👍

One issue though — the fix still has awk '{print tolower($2)}' at the end of the pipe, but after the sed 's/^[^:]*: *//' strips the header name + colon, the remaining value (e.g. "gzip") becomes $1 not $2.

So with the current fix:

  • "content-encoding: gzip" → sed → "gzip" → awk $2 = empty → FAIL
  • "content-encoding:gzip" → sed → "gzip" → awk $2 = empty → FAIL

Should be $1 instead of $2:

comp_encoding=$(echo "$comp_headers" | grep -i "^content-encoding:" | sed 's/^[^:]*:[[:space:]]*//' | tr -d '\\r' | awk '{print tolower($1)}' || true)

Or simpler — keep the awk-only approach but make it handle both formats:

comp_encoding=$(echo "$comp_headers" | grep -i "^content-encoding:" | tr -d '\\r' | awk -F: '{gsub(/^[[:space:]]+/, "", $2); print tolower($2)}' || true)

This uses : as the field separator so $2 is everything after the colon, then strips leading whitespace.

@MDA2AV
Copy link
Copy Markdown
Owner

MDA2AV commented Mar 29, 2026

Ok, this is RFC valid.

@BennyFranciscus
Copy link
Copy Markdown
Collaborator

Hey @p8, the intent here is right but I think there's a bug in the sed+awk pipeline.

After sed 's/^[^:]*: *//' strips the header name, the value becomes $1 for awk, not $2:

# Input: "content-encoding:gzip"
# After sed: "gzip"
# awk {print $2} → empty (gzip is $1)

# Input: "content-encoding: gzip"  
# After sed: "gzip"
# awk {print $2} → empty (same issue)

Simplest fix — either change $2 to $1:

comp_encoding=$(echo "$comp_headers" | grep -i "^content-encoding:" | sed 's/^[^:]*: *//' | tr -d '\\r' | awk '{print tolower($1)}' || true)

Or skip sed entirely and use a more flexible awk delimiter:

comp_encoding=$(echo "$comp_headers" | grep -i "^content-encoding:" | tr -d '\\r' | awk -F ":[[:space:]]*" '{print tolower($2)}' || true)

The second approach handles both content-encoding: gzip and content-encoding:gzip in one step.

@p8 p8 force-pushed the validate-encoding-header branch from 18db923 to 5ab12f0 Compare March 29, 2026 19:26
Copy link
Copy Markdown
Collaborator

@BennyFranciscus BennyFranciscus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now 👍 The sed+awk pipe handles both cases cleanly.

Some framework return headers without whitespace:

    content-encoding:gzip

The `check_header` method already takes this into account, but the
encoding header check does not.
@p8 p8 force-pushed the validate-encoding-header branch from 5ab12f0 to f19228c Compare March 30, 2026 06:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants